Lightly Supervised Quality Estimation
Authors
Abstract
Evaluating the quality of output from language processing systems such as machine translation or speech recognition is an essential step in ensuring that they are sufficient for practical use. However, depending on the practical requirements, evaluation approaches can differ strongly. Often, reference-based evaluation measures (such as BLEU or WER) are appealing because they are cheap and allow rapid quantitative comparison. On the other hand, practitioners often focus on manual evaluation because they must deal with frequently changing domains and quality standards requested by customers, for which reference-based evaluation is insufficient or not possible due to missing in-domain reference data (Harris et al., 2016). In this paper, we attempt to bridge this gap by proposing a framework for lightly supervised quality estimation. We collect manually annotated scores for a small number of segments in a test corpus or document, and combine them with automatically predicted quality scores for the remaining segments to predict an overall quality estimate. An evaluation shows that our framework estimates quality more reliably than using fully automatic quality estimation approaches, while keeping annotation effort low by not requiring full references to be available for the particular domain.
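The following Python sketch illustrates the basic idea of the framework, not the paper's actual combination scheme: quality scores are collected manually for a small sampled subset of segments, predicted automatically for the rest, and combined into a single document-level estimate. The random sampling, the simple averaging, and all function names here are illustrative assumptions.

```python
# Minimal sketch of lightly supervised quality estimation (illustrative only):
# manual scores for a small annotated subset are combined with automatically
# predicted scores for the remaining segments to give an overall estimate.
import random
from typing import Callable, List, Sequence


def estimate_document_quality(
    segments: Sequence[str],
    annotate: Callable[[str], float],  # human judge: segment -> quality score
    predict: Callable[[str], float],   # automatic QE model: segment -> quality score
    n_manual: int = 10,
    seed: int = 0,
) -> float:
    """Average manual scores for a sampled subset with predicted scores for the rest."""
    rng = random.Random(seed)
    manual_idx = set(rng.sample(range(len(segments)), min(n_manual, len(segments))))

    scores: List[float] = []
    for i, seg in enumerate(segments):
        if i in manual_idx:
            scores.append(annotate(seg))  # manually annotated quality score
        else:
            scores.append(predict(seg))   # automatically predicted quality score
    return sum(scores) / len(scores)


if __name__ == "__main__":
    doc = [f"segment {i}" for i in range(100)]
    # Stand-in annotator and QE model, for demonstration only.
    human = lambda s: 0.9
    qe_model = lambda s: 0.7
    print(estimate_document_quality(doc, human, qe_model, n_manual=20))
```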
Similar References
Improving Lightly Supervised Training for Broadcast Transcriptions
This paper investigates improving lightly supervised acoustic model training for an archive of broadcast data. Standard lightly supervised training uses automatically derived decoding hypotheses using a biased language model. However, as the actual speech can deviate significantly from the original programme scripts that are supplied, the quality of standard lightly supervised hypotheses can be...
A Lightly Supervised Approach to Role Identification in Wikipedia Talk Page Discussions
In this paper we describe an application of a lightly supervised Role Identification Model (RIM) to the analysis of coordination in Wikipedia talk page discussions. Our goal is to understand the substance of important coordination roles that predict quality of the Wikipedia pages where the discussions take place. Using the model as a lens, we present an analysis of four important coordination r...
Lightly supervised training for risk-based discriminative language models
We propose a lightly supervised training method for a discriminative language model (DLM) based on risk minimization criteria. In lightly supervised training, pseudo labels generated by automatic speech recognition (ASR) are used as references. However, as these labels usually include recognition errors, the discriminative models estimated from such faulty reference labels may degrade ASR perfo...
Combining lightly-supervised learning and user feedback to construct and improve a statistical parametric speech synthesiser for Malay
In spite of the learning-from-data used to train the statistical models, the construction of a statistical parametric speech synthesiser involves substantial human effort, especially when using imperfect data or working on a new language. Here, we use lightly-supervised methods for preparing the data and constructing the text-processing front end. This initial system is then iteratively improve...